Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix deadloop if proxy_handle_upstream exit early than proxy_handle_downstream #480

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

taikulawo
Copy link

fix #475

@andrewhavck andrewhavck self-assigned this Nov 23, 2024
ctx,
)
.await?;
downstream_state.maybe_finished(request_done);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We initialize downstream_state with DownstreamStateMachine::new(session.as_mut().is_body_done()); and is_body_done() already checks if the body is empty or done. This downstream_state.maybe_finished(request_done); looks like a no-op to me?

@andrewhavck
Copy link
Contributor

Would you be able to provide example code that reproduces this problem or additional logs from when it occurred? Thanks.

@taikulawo
Copy link
Author

taikulawo commented Nov 24, 2024

It's hard to reproduce, we collect about 17G log to figure out what happen at that points.
I add many debug log to print some variables, It's useless without give you codes, hard to remove log's secret info.

From our limited not completed logs. It's happen in HTTP2 POST, retrying. content-length == 0

session.as_mut().is_body_done() == false
// false || true
buffer.is_some() || session.as_mut().is_body_empty()
// false || true
let end_of_body = end_of_body || data.is_none();

@andrewhavck
Copy link
Contributor

Ok, that’s helpful, I’ll spend some more time looking into this.

@andrewhavck
Copy link
Contributor

It's happen in HTTP2 POST, retrying. content-length == 0

I'm able to reproduce this race condition locally, this was a good hint. The session.as_mut().is_body_done() depends on the receive half of the stream to be EOS in order to be true which is different than http/1.1.

Returning the request_done here works but it's only because data.is_none() evaluates to true in send_body_to_pipe, there's probably a better way to handle this.

@taikulawo
Copy link
Author

taikulawo commented Nov 25, 2024

yes, end of stream has different means than end of body.

I know that's not the best solution but after patch it, it's never happened again in our prod.

you can give some hint how I could change pr.

@andrewhavck
Copy link
Contributor

We'll want to fix is_body_done for h2 to check is_body_empty so it's consistent with the http/1.1 implementation, that makes the proxy logic consistent between protocols.

It's also a good idea in the tx.reserve() select! arm to check if the tx is closed to be defensive. I don't think any changes need to be made in proxy_h2.

This is a bit more involved, we can fix this internally if you don't have time to amend the PR. Thanks again for filing an issue for this.

@taikulawo
Copy link
Author

Yes, we can revert proxy_h2 changes and also we haven't use h2 upstream yet.

@taikulawo taikulawo requested a review from andrewhavck December 6, 2024 03:07
@andrewhavck andrewhavck added the Accepted This change is accepted by us and merged to our internal repo label Dec 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Accepted This change is accepted by us and merged to our internal repo
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Bug: deadloop in proxy_handle_downstream
3 participants